A quantitative analysis of global gazetteers: Patterns of coverage for common feature types
Abstract
Article history: Received 21 September 2016; received in revised form 28 January 2017; accepted 13 March 2017; available online xxxx.
Gazetteers are important tools used in a wide variety of workflows that depend on linking natural language text to geographical space. The spatial properties of these data sources, such as coverage, balance, and completeness, affect the performance of common tasks such as geoparsing and geocoding. However, little attention has been paid to how these properties vary in global gazetteers, particularly across country boundaries and according to feature types. In this paper, we present a detailed investigation of the spatial properties of two open gazetteers with worldwide coverage: GeoNames and the Getty Thesaurus of Geographic Names (TGN). Using point density maps, correlations, and linear regressions, we analyze the global spatial coverage of each data source for the full set of features and for the top feature types: populated places, streams, mountains, and hills. Results show wide discrepancies in coverage between the two datasets, sharp changes in feature type coverage across country borders, and idiosyncratic patterns dominated by a few countries for the more sparsely covered natural features. As more and more systems rely on recognizing and grounding named places, these patterns can influence the analysis of growing amounts of online text content and reinforce or amplify existing inequalities.
© 2017 The Author. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
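The abstract mentions point density maps and correlations between the two gazetteers. The following is a minimal sketch of that general idea, not the authors' actual procedure: it bins point features into latitude/longitude cells and computes a Pearson correlation between the per-cell counts of two datasets. The function names, the 1-degree cell size, and the use of an equal-angle grid (which is not equal-area) are all assumptions made for illustration.

```python
from collections import Counter
from math import floor, sqrt


def grid_counts(points, cell_deg=1.0):
    """Count points per grid cell; points are (lat, lon) pairs in degrees.

    Assumption: a simple equal-angle grid, so cells shrink toward the poles.
    """
    counts = Counter()
    for lat, lon in points:
        counts[(floor(lat / cell_deg), floor(lon / cell_deg))] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def coverage_correlation(points_a, points_b, cell_deg=1.0):
    """Correlate per-cell feature counts of two hypothetical gazetteers.

    Cells that appear in only one dataset count as zero in the other.
    """
    a = grid_counts(points_a, cell_deg)
    b = grid_counts(points_b, cell_deg)
    cells = sorted(set(a) | set(b))
    return pearson([a[c] for c in cells], [b[c] for c in cells])
```

With real data, `points_a` and `points_b` would be the coordinates of the same feature type (for example, populated places) drawn from each gazetteer. A value near 1 would suggest similar spatial coverage, while a low or negative value would point to the kind of discrepancy the abstract describes.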
Related articles
Spatial signatures for geographic feature types: examining gazetteer ontologies using spatial statistics
Digital gazetteers play a key role in modern information systems and infrastructures. They facilitate (spatial) search, deliver contextual information to recommender systems, enrich textual information with geographical references, and provide stable identifiers to interlink actors, events, and objects by the places they interact with. Hence, it is unsurprising that gazetteers, such as GeoNames...
Common Spatial Patterns Feature Extraction and Support Vector Machine Classification for Motor Imagery with the SecondBrain
Recently, a large set of electroencephalography (EEG) data is being generated by several high-quality labs worldwide and is free to be used by all researchers in the world. On the other hand, many neuroscience researchers need these data to study different neural disorders for better diagnosis and evaluating the treatment. However, some format adaptation and pre-processing are necessary before ...
Comparative Effectiveness of Semantic Feature Analysis (SFA) and Phonological Components Analysis (PCA) for Anomia Treatment in Persian Speaking Patients With Aphasia
Objectives: Anomia is one of the most common and persistent symptoms of aphasia. Although treatments of anomia usually focus on semantic and/or phonological levels, which both have been demonstrated to be effective, the relationship between the underlying functional deficit in naming and response to a particular treatment approach remains unclear. The aim of this study was to determine the rela...
Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI
Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics, which can be used as biomarkers for prostate lesion detection and characterization. Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...
Global Coverage of Pneumococcal Conjugate Vaccine (PCV) and Serotype Distribution after Receiving Vaccine among Targeted PCV Vaccine Countries: A Systematic Review
Background and Objectives: After the introduction of the pneumococcal vaccine, an increase has been observed in disease due to serotypes not covered by the vaccine. This study was conducted to determine the spatial distribution of pneumococcal vaccine coverage and common serotypes of Streptococcus pneumoniae after vaccine introduction in the vaccine recipient countries. Methods: The ...
Journal: Computers, Environment and Urban Systems
Volume: 64
Publication date: 2017